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Abstract 

We introduce and demonstrate two linear inverse modelling methods for systems of stochastic ODE’s with ac¬ 
curacy that is independent of the dimensionality (number of elements) of the state vector representing the system in 
question. Truncation of the state space is not required. Instead we rely on the principle that perturbations decay with 
distance or the fact that for many systems, the state of each data point is only determined at an instant by itself and 
its neighbours. We further show that all necessary calculations, as well as numerical integration of the resulting linear 
stochastic system, require computational time and memory proportional to the dimensionality of the state vector. 
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1. Introduction 


Consider the linear stochastic system of ordinary differential equations 


dx 

~At 


Bx + f 


( 1 ) 


where x is a dimensional vector that contains the state of the system at a particular time f, B is a matrix of constant 
coefficients and ^ denotes a vector white noise process. The notation is chosen following ||T]. We consider the case 
were the eigenvalues of B all have negative real parts and 


Q - fP) (2) 

is the noise covariance matrix which is symmetric positive definite. The angled brackets (...) denote the expectation 
value and T denotes the matrix transpose. Given a time series of n data points X;, i - 1... n, each representing x at a 
particular time f, our aim is to find B and Q. 


7.7. Applications 

The linear system (fn with B and Q found from data has been applied for many years to approximate the dynamics 
of non-linear systems |2l. In particular, analysis of the surface temperatures in the pacific (e.g. Q, 0) and atlantic 
(e.g. 111,0) oceans have been studied, with extensions to the sub surface dynamics (e.g. 0). A closely related 
approach is to solve the system of ocean governing equations on a computationally feasible grid, necessitating a 
higher viscosity. One or both of the terms in Q are added to the right hand side of the governing equations to 
approximate the sub-grid scale flow (e.g. 0, 0, 121, ITOl . IfTTII . IfT^ . 1T31 . lfT4ll '). The author’s motivation for this 
work is to And an improved estimate of B and Q and use ([T]i for this purpose. 

The time averaged statistics of a fluid flow may be found by integrating the governing equations for a sufficient 
length of time. Rather than solving an equation governing the instantaneous flow, several authors have considered 
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solving the equations for the statistics (e.g. lEi, m, csi). Such attempts require neglecting all of the 

cumulants beyond the first two, or parameterising missing terms with a linear stochastic term. The only system with 
two cumulants is a Gaussian system EOl and Q is a system that is capable of replicating a Gaussian probability 
density function (PDF). If such statistical equations are applicable then a method of closure is to use the statistics 
measured from a flow to estimate B and Q, and hence the governing system Q- 

If ([T]i with appropriate parameters can be used as an accurate model of the Earth’s oceans or atmosphere (or sub¬ 
grid model), then it may be applied to improve estimates of climate change (e.g. ED), although the utility of such 
linear estimates may be qualitative m . Adding df to the right hand side of Q to represent a constant forcing causes 
the time mean, or climatological mean, of x to change. Denoting this change {6x) we get 

<dx> = 

The form of forcing given a particular response can also be found by rearranging to get 

df = -B (dx). 

Accurate representation of 0 also has potential for use with statistical significance testing. A common hypothesis 
to test is that some data is significantly different from uncorrelated Gaussian random noise. The appropriate test in this 
case is Students t-test. The assumption of independence required by Students t-test is not satisfied in the case of data 
that is correlated in time and the test is not appropriate. There are alternatives (see mi one is to use Monte-Carlo 
integration to test that the time series of data is significantly different from a first order auto-regressive process, the 
discrete analogue of 0 with dimension d = 1. However much of the science of the ocean and atmosphere involves 
the analysis of large data sets that are both spatially and temporarily correlated. Both spatial and temporal correlation 
can be included in the null-hypothesis model by considering 0 with d > 1. 


1.2. Linear Inverse Modelling (LIM) 

Multiplying 0 by x(0)^, taking the expectation value and solving the system of ordinary differential equations, 
gives the lag r covariance matrix 


C(t) = exp(BT)C(0) 
= (x(t)x(0)T) 


n — m ^ 
1=1 


yx,.„xT, 


(3) 

(4) 

(5) 


where the matrix exponential, 


oo ^ 

exp(B) = y-B 

k=a 


is used and m is the number of data points within the time t. The covariance matrices may be estimated from the data 
and by rearranging 0,B can be expressed in terms of the covariance matrices. 


B= ilog[C(T)C(0)-'] 


(6) 


where the matrix logarithm (principal value of the inverse matrix exponential) is used and B is independent of t. This 
expression is sometimes referred to as the linear fluctuation dissipation relation or the linear fluctuation dissipation 
theorem. Integrating Q with respect to t between zero and infinity, we get 


B = - 


/~*oo ^ 

I C(t)C(0)^'£/t 

Jo I 


(7) 


which is sometimes referred to as the Gaussian fluctuation dissipation theorem (FDT) because it may be derived by 
assuming that the system in question has a Gaussian probability density function (PDF) instead of assuming the linear 
stochastic system 0 (e.g. m, 1251, 1261). Having found B, the noise covariance matrix Q may be found using the 
Lyapunov equation (e.g. 0) 

BC(0) -H C(0)B'^ -H Q = 0. (8) 


2 



1.3. Truncation of the data set 

In (j^ and Q the inverse of the covariance matrix C(0) ' appears. For a non-singular C(0) (so that its inverse 
can be found) it is required that the number of measurements in time, n, is greater than the dimensionality, d, of the 
data set, n > d. Thus, and somewhat paradoxically, the more data that is collected at each interval of time (large d), 
the longer the data must be collected for (large n). This is a fundamental problem with (|^ and (|^ and means that 
the length of time data is required to be recorded over, can be impossibly long. A less serious problem is that if two 
data points behave in a similar way, because they reflect two measurements of a similar physical quantity, then two 
of the rows of C(0) are also similar and it is close to singular. It may take a lot of data to accurately characterise the 
difference between the two points and achieve a numerically non-singular C(0). 

Another issue is the computational cost. Computation of the matrix inverse in (|^ and Q, the matrix logarithm 
in ^ and the general multiplication of B and C(0) in ^ each take of the order of d^ floating point operations on 
a computer (e.g. Il27ll . ll28l '). Having found B and Q, there is also the problem of numerical integration of the linear 
system ([^. For the evaluation of the eigenvalues and vectors of Q must be found o nce, taking of the order of d^ 


operations, and in general both the Bx and ^ terms take d^ operations per time step, see Appendix A Thus, for large 


d the computational cost of either finding B and Q or integrating Q becomes prohibitive. 

The solution to the problem of d being too large, is to truncate the data set to a lower dimensionality d', where d' 
is sufficiently small for practical use. This is typically achieved by finding the leading d' eigenvalues and eigenvectors 
of C(0) and setting the remaining d-d’ eigenvalues to zero. The data is then transformed into the space defined by 
the matrix of the leading d' eigenvectors. Vs, commonly referred to as Empirical Orthogonal Function (EOF) space 
(e.g. 1291). Giving 


Cs(f) - V,^C(f)Vs 


B, 


= VjBVs 


and 


Qs = V,^QVs 


(9) 


Here the subscript s denotes the truncated matrices. The problem of inverting the d' x d' matrix Cs(0) then requires 
that n > d', calculation of Bs and Qs takes of the order of d'^ operations and integration of the truncated version of 
Q takes of the order of d'^ operations per time step. An estimate of the full B and Q matrices may then be made by 
performing the inverse of (|9]|. The assumption is that the most important processes have the largest variance. Even if 
this is true, neglecting the least variable processes combined with inaccuracies in the estimation of Vs introduces bias 
into the estimation of B and Q. For a chaotic system where each point is governed by the same rules as its neighbours, 
truncation in EOF space may not be the most appropriate truncation to make. In this case a localised truncation can 
be optimal If30l . 


1.4. Paper Outline 

The purpose of this paper is to introduce two alternative methods that do not require truncation in EOF space, 
instead relying on the assumption of locality and using the fact that for many problems, B is sparse. Locality is 
defined by assuming that elements of C(t) relating two points with a distance greater than some critical value, may be 
set to zero. This is the same as assuming that if there are no significant correlations at a lag t between two variables, 
there is no evidence that they have any significant relation at this lag, and are therefore assumed to be effectively 
independent. With these assumptions, the accuracy of any estimate of B becomes independent of the dimensionality 
d of the state vector x. Practical results of this approach are that bias and smoothing due to EOE truncation are 
eliminated and that less data is required for a given accuracy. 

In sectionj^we describe two methods of finding a local B and Q that have an accuracy independent of d. In section 
l^we introduce a test model to evaluate the ability of the algorithms presented in section]^ In section]^ using limited 
data (often n d) from the test models, with d between 2 and 2'®, (65536), the performance of the algorithms is 
demonstrated. Our conclusions are described in section|5] 


2. The method 

2.1. The local Gaussian FDT 
We define 


A = 



T 


C(T)dT 


( 10 ) 
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and rearrange (0 to get 


( 11 ) 


AB’^ = -C(0). 

A row vector b, may be defined so that it contains the /th row and only the r non-zero columns of B. r is equal to 
the number of elements of B that exist in the expression for the right hand side of dxijdt, where xi denotes a single 
element of the vector x. A may be truncated to a matrix AJ that contains all d rows and r columns of A. The index 
of each included column corresponds to the non-zero columns of B in its /th row. Then for each column / = \...d of 
C(0), denoted c,, we may write 

A;b[ = -c,-. (12) 

For a finite number of data points, n, any estimate of C(t) or A using data, 0, will typically result in an overestimate 
of the magnitude of elements where the true value is sufficiently close to zero. Contribution of this overestimate to 
any resulting estimate of b, will in general depend upon d. Assuming that C(t) decays with distance, we pick the 
distances Ocut and Ccut above which the respective elements of A and C(0) are set to zero and call the truncated version 
of this matrix and vector A" and c". This gives 


A"b/ 


(13) 


All of the b,’s may then be found using a linear least squares fit via the singular value decomposition, and combined 
to give an estimate of B. We call this method the local Gaussian FDT because we have made the approximation 
that perturbations decay to zero after some distance. The local Gaussian FDT may be generalised to the non-linear, 
non-Gaussian case in the context of estimating the response to a forcing ESh . 


2.2. Local LIM 


If we instead make the approximation that information travels at a finite speed between different Xi, then after a 
sufficiently short time t the set of elements x"' of x that can possibly have an important influence upon x, are those 
corresponding to the non-zero elements of row / of B. As t increases, the number of elements of x that have an 
important influence upon x,- increases. We therefore assume that for small t, (j^ does not require information from the 
full covariance matrices C(0) and C(t) in order to accurately approximate B. The accuracy of this approximation for 
any given r depends upon B. To estimate row / of B the covariance matrices C'"(0) and C'"(t) of the time series x'" 
are required 

B'" = iiog[c:"(T)c;"(0)-']. (14) 


Then the b,, (defined in section 2.1 1 , used to estimate B, are given by the corresponding row of B'" for each i. We 
call this method local linear inverse modelling because we have made the approximation that perturbations can only 
be felt within a finite distance after a finite time. 


2.3. The noise covariance matrix. 

The time taken for a x dimensional matrix matrix multiplication is conventionally proportional to d^. However 
if B is sparse the time taken to find Q using 0 can be faster, proportional to d^. Q is in general dense, even if B is 
sparse, so finding Q may be problematic for extremely large d. It turns out that finding all elements of Q may not be 
necessary. If we assume that the process generating the noise ^ is somewhat local in nature, then both the number of 
elements of Q necessary for numerical integration of Q and the time taken to generate ^(f) at a particular time t is 
proportional to d, see the appendix. 


3. Test integrations 

We compare algorithms by testing them with a time series generated by a simple linear stochastic model. The 
model we use is a system of coupled linear equations 

djc' 

= (/7i,,-lXi_i -H bi^iXi + fi,-,+ iX,+i) + (15) 
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with i - I.. .d. bij represents the elements of the constant matrix B with bifl - bd,d+\ - 0 and all elements of B not 
included in 0 being zero. The deterministic part of this system is similar in nature to a discretised partial differential 
equation with one spatial and one temporal dimension. The constants bij are chosen randomly using the following 
expressions 


bid - ~ 1) Mi.l “ O', 

bid+l = (()0 - 1) Mi,2 - y6) bid, 
bid-1 = (O' - 1) m,'3 {bid + bid+i), 


1 < i < d, 
1 <i<d-l, 

2 < i < d. 


where Ui j represents a random number chosen from a uniform distribution between 0 and \,a - 0.5 and p — 0.35 are 
constants that govern the autocorrelation decay time and the coupling between state vector elements respectively. In 
our test case we assume that only the tridiagonal elements of Q are important. Then since Q is symmetric, only the 
diagonal and +1 off diagonal elements are required. For the hypothetical physical system that we are assuming, the 
other elements of Q add no additional useful information. The diagonal of Q is then chosen by 


1/2 
M ^ ('•u 






I <i < d. 


and the +1 off diagonal is given by 


qu+i = 2 1.1 + nan+hi + I’/si'i+i.s), 


l<i<d-l. 


where rjj represents a random number chosen from a Gaussian distribution with zero mean and unit variance. (15 i is 
integrated for each i using the Euler-Maruyama method 


= 2 :" + 


(^bid-ix"_i + bidx” + bid+ix"_,_i) At + VAt, I <i <d. 


where At - 0.01 is chosen as the time step, the state vector is recorded at each time step, and the values of are 
chosen given the method in the appendix. This test system is particularly simple, so in this case 




and 






€ 



2 < i < d 


qi-i,i—i y 1 

where tl/j are random numbers chosen from a Gaussian distribution with zero mean and unit variance. 

When applying the Gaussian FDT (j7]i and local Gaussian FDT, ( [T0) i and (131, to any of these data sets, an upper 
limit of the integral of 20 is chosen to approximate infinity. When applying LIM^, or local LIM ( |T4) i to any of these 
data sets, a value of r = 1 is chosen, n instances of the state vector at lag zero and lag one and the integral of the state 
vector are kept every 20 time units. All intermediate data and a spin up from t - -20 to f = 0 with random initial 
conditions, is discarded. 


4. Results 

4.1. Clipping the local Gaussian FDT 

For the local Gaussian FDT we need to truncate the matrix A from ( [Tol i and the covariance matrix C(0). If our 
physical understanding of the system is not sufficient then, since we have a noisy estimate of these matrices from the 
data, this cut-off can be chosen by looking at their structure and an add-hoc estimate of a zero correlation distance. To 
illustrate the gains in accurate estimation of B by assuming a cut-off, the root mean squared (RMS) error is plotted as 
a function of the clipping distance in figure[2 An alternative method, also plotted in figure[2 is to choose to cut-off at 
a particular correlation. We choose a minimum allowed correlation and set elements of A and C(0) to zero, where the 
correlation is below its cut-off value. Figure shows that the local Gaussian FDT performs a more accurate estimate 
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d=512, n=10^ d=512,n=10’ 



Figure 1: A plot of the RMS error in the diagonal of B as a function of the distance (left) and magnitude (right) of ignored congelations for the local 
Gaussian FDT with d = 512. For truncation with distance (left), the circles represent the ensemble mean over 100 independent data sets each of 
n = 1000 data points. The dashed lines denote the ensemble standard deviation multiplied by 2/ VlOO. For truncation by correlation, a single data 
set of « = 10^ instances of the state vector was used. 


for a moderate cut-off. Ignoring distances above 6 vector elements or correlations below 0.01 is close to optimal in 
this case. Some form of cross validation can be used to choose the cut-off when applying the local Gaussian FDT 
to a new data set. Similar results are obtained for the RMS error in the +1 off diagonals. The optimum truncation 
depends upon several factors and for simplicity we choose to truncate all lag-correlations (to zero) above a distance 
of 32 vector elements in all further applications of the local Gaussian FDT in this paper. 

4.2. Convergence with length of the data set 

Correlations of a stable linear stochastic system decay exponentially in time. So after some time the system is 
effectively independent of its initial state. Therefore the expected error in the mean calculated from a sample of size n 
of independent random numbers is proportional to 1 / sfn. For sufficient data, we expect the error in estimates of B and 
Q to also be proportional to 1/ y/n. This is examined in figure]^ which shows that given sufficient data, the error in all 
methods appears to decay approximately according to this rule. The error in the FDT demonstrates the quantity of data 
required before reasonable estimates can be obtained (around 2 to 3 x 10"^ data points for d — 512). In comparison, 
the local FDT has good performance for small data sets but is beaten by the FDT for large data sets. This rehects 
the fact that correlations beyond the 32 grid point clipping distance, are resolved. In this test LIM without truncation 
performs better than both forms of the FDT. Like the FDT, a certain quantity of data is required to dramatically reduce 
the error before Ij sfn behaviour is approximately reached. The local LIM method yields the smallest error, at large n 
having effectively the same accuracy as the LIM method without truncation. Similar results were obtained for the off 
diagonal elements of B and the elements of Q. 

4.3. Accuracy as a function of dimensionality 

For a limited data set of 1000 points in time, figure demonstrates the error of each method as a function of 
dimensionality d of the state vector x. The FDT and local FDT display similar performance for low dimensional 
{d < 10) systems, but the performance of the FDT becomes poor and unpredictable at higher dimensionality while 
the performance of the local FDT plateaus. Similar behaviour is observed for LIM and local LIM. It is clear that for 
> 100 the performance of the local FDT and local LIM is independent of d. Again, similar results were obtained 
for the off diagonal elements of B and the elements of Q. 
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d=512 



Figure 2: Accuracy of each method as a function of the time window over which data is available. Note the logarithmic axes. For these integrations 
d = 512. The circles and solid lines indicate the mean over an ensemble of 100 independent members. The dashed lines indicate the ensemble 
standard deviation multiplied by 2/ VlOO. 


n=1000 



Figure 3: The accuracy of each method as a function of dimensionality of the state vector. Note the logarithmic axes. Each estimate uses n = 1000 
instances of the state vector with the highest dimensionality tested being d = 65536 (= 2^^). The circles and solid lines indicate the mean over an 
ensemble of 100 independent members. The dashed lines indicate the ensemble standard deviation multiplied by 2/ VlOO. The highest dimensional 
points for the conventional FDT and LIM are omitted due to computational time and memory limitations. 
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n=10^ 



Figure 4: The time taken for each method per state vector element as a function of dimensionality of the state vector. Note the logarithmic axes. 
Calculations and timings were performed on a standard desktop computer and each estimate was performed with a single integration providing 
« = 10^ instances of the state vector. 

4.4. Computation time as a function of dimensionality 

Figurej^shows the time taken per variable for the application of each method as a function of dimensionality. The 
time taken for the Gaussian FDT is dominated by the matrix inverse operation and the time taken for LIM is dominated 
by the matrix logarithm operation. The time taken for either is proportional to using conventional algorithms. The 
time taken for their local counterparts is proportional to d in both cases. 

5. Conclusion 

We have presented two algorithms for finding the parameters governing a high dimensional linear system. We call 
them the local Gaussian fluctuation dissipation theorem (FDT) and local linear inverse modelling (LIM). The accuracy 
of these algorithms does not depend upon the dimensionality of the state vector and the time taken in practice for their 
computation is proportional to the number of state vector elements. We have tested our algorithm with linear stochastic 
systems of up to 2'® variables. Although the particular application in mind here is approximation of a turbulent fluid, 
we expect that this method can be applied in other contexts. 

The conventional method of dealing with high dimensional systems is to first make a truncation into some smaller 
space, for example EOF space. Unfortunately it may be the case that cut-off in the spatial spectrum of a turbulent 
fluid leads to less accurate predictions. The algorithms presented do not require such a truncation, even for extremely 
high dimensional systems. Instead, some kind of locality of the system needs to be assumed. For example, in a spatial 
discretisation of a held into a number of grid points, that reasonable perturbations to a single grid point only affect a 
limited number of local grid points after a small amount of time. After longer times, the fact that a perturbation can 
propagate over the entire domain does not reduce the accuracy of the local linear FDT. However, the accuracy and 
computation time of the local Gaussian FDT depends upon the number of grid points that the perturbation propagates 
over. As is standard with the discretisation of many partial differential equations, part of this assumption requires that 
the matrix B in equation 0 is sparse. 

In addition to the choice of a lag time that approximates infinity, required by the standard Gaussian FDT, the local 
Gaussian FDT requires the specification of the maximum distance that a perturbation can propagate before becoming 
insignificant. This quantity may typically be estimated by understanding of the physical system modelled and by 
looking at the spatial range of typical correlations in the data. For the local linear FDT, the state vector elements that 
a perturbation can reach after a sufficiently small time, for example one model time step, must also be provided. Both 
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Figure A.5: An example of what we mean by a direct link between the physical process that underly the white noise process. is directly linked 
by some physical process to ^ 2 - ^2 is directly linked to both and ^ 3 . ^3 is directly linked to ^ 2 - Although and ^3 may be correlated, it is only 
due to their shared correlation with ^ 2 - We assume that there is no additional process that links them providing additional congelation. They ai‘e 
therefore not directly linked and the con'elation between them provides no additional useful information. 


the standard linear FDT and the local linear FDT require the choice of a parameter that represents this sufficiently 
small time. 


Appendix A. Generation of spatially correlated random numbers. 

Numerical integration of a single realisation of 0 requires generation of correlated random numbers representing 
To do this in the conventional way, Q must be found using 0. The noise at time t may then be generated by 

^(f) = vVD|A(0 (A.l) 


where V is the matrix of the eigenvectors of Q, D is the corresponding diagonal matrix of eigenvalues, the square 
root is taken element wise and is a vector of Gaussian distributed independent random numbers with unit variance. 
However the computational ti me r equired to estimate V and D is proportional to cP and the computational time and 
memory required to evaluate (A. 1 1 is proportional to (P unless V has a simple structure. By making the assumption 
that element j of ^j, is only directly linked to pj other elements, the computational time and memory required may 
be reduced to be proportional to p] ^nd pj respectively. It is possible that correlations between and ^3 
are entirely caused by both of them being correlated with ^ 2 - So in this context “direct link” means that correlations 
between and ^3 do not require another see hgure [A3| Which of the elements of ^ are directly linked to each 
other is a necessary assumption and may be justified, for example, by understanding of the physical process modelled. 


Appendix A.l. A faster method 

The first element of fi, may be generated at any time by 

flit) = y/oiffiit) (A.2) 

where fft) is an element of ^(f), a Gaussian distributed random number with unit variance and zero temporal correla¬ 
tion. The constant ai j is the variance of f\ given by a\ \ - q\^i with qtj representing element i, j of Q. If correlations 
in the next point, f 2 , with are caused by a direct link between the physical processes governing these two terms 
then 

flit) - ai.ifiit) + yfoiffiit)- 

Here ^2 has a component correlated with f\ with a magnitude given by the constant ai^i and a component that is 
independent of fi with a magnitude given by 02 , 2 - ^ 1,2 must be chosen so that the covariance of fi and ^2 equals qi_ 2 - 
Therefore 


(fif2) = qi.2 
(^1 (ai,2^i + ^fap2f2)) = ^1,2 
ai,2 = qi.2 
ai,2qi,i - qi,2- 
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fl 2,2 must now be chosen so that the variance of ^2 is equal to ^ 2 , 2 , 


(^ 2 ^ 2 ) - q2,2 

((ai, 2^1 + V“ 2 , 2 <A 2 ) (ai, 2^1 + V‘ 22 , 2 </' 2 )) = ? 2,2 

+ 2 ^ 1,2 V'^ 2 , 2 ^ 1 'A 2 + ^2,2^^ - ? 2,2 

a\,2{^\) + =? 2,2 

2 

^1,2^1,1 + ^2,2 - q2,2- 


Therefore 


2 

02,2 - g' 2,2 - Oi_ 2 ?l,l- 

Written as a matrix equation and noting that , = qp, aij and ( 22,2 may be found by solving 


/ q' 1,1 0 \ / ai ,2 
\ ? 1,2 1 / \ 02,2 


If the element ^3 is directly linked with f 1 and ^2 then 


? 1,2 \ 
q2,2 ) 


^3(0 - Oi, 3 fl(f) + 02,3^2(0 + 2 ja 3 , 3 lj^ 2 (t) 

and the same reasoning as before leads to the matrix equation 


^1,1 ^2,1 0 

' ai ,3 


' qi ,3 

qi .2 q 2,2 0 

02,3 


q 2,3 

qi .3 q 2,3 1 , 

. 03,3 , 


. q 3,3 , 


which must be solved for 02,3 and 033 . Similarly if ^4 is directly linked with ^ 1 , ^2 and ^3 then 


^4(0 - ai, 4 ^i(f) + 02,4^2(0 + 03,4^3(0 + y[a^tf//[{t) 

and we get 


qi,i q2,i q3,i 0 


' < 21,4 ' 


' qi,4 ' 

q\,2 q2,2 q3,2 0 


^2,4 


q2,4 

q\,3 q2,3 q3,3 0 


03,4 


q3,4 

qi,4 q2,4 q3,4 l J 


, 224,4 J 


\ q4,4 / 


On the other hand, if ^ 4(0 is not directly related to only being directly related to ^ 2(0 and ^ 3 ( 0 , then 


^4(0 - 02,4^2(0 + 03,4^3(0 + ^04,41/^4(0 


(A.3) 


q2,2 q3,2 0 

222,4 


' q2,4 

q2,3 q3,3 0 

223,4 


q3,4 

q2,4 q3,4 1 , 

, ^4,4 , 


. ?4,4 , 


The precise numerical values of the indices may be altered to suit a particular problem without difficulty. Although 
we have only made use of a subset of the elements of Q, it is not sparse. If necessary, any of the remaining elements 
of Q may be found in terms of elements already found. For example q^A, using equation (A.3 1 is given by 


q\A - (^ 1 ^ 4 ) 

= (^1 (^2,4^2 + 03,4^3 + V‘^4,4lA4)) 
= ^2,4 (^ 1 ^ 2 ) + ^3,4 
= a2,4qi,2 + a3,4?l,3- 


Note that these values, calculated for remaining undefined elements of Q, are only valid for the specific order of the 
equations used to generate If a different element of ^ is used as a starting point (A.2 1 , then in general different 
values of the undefined elements of Q may be found. We are assuming that either the differences are not important or 
that that the starting point is somehow correctly chosen. 
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